The C Programming Language Notes
Table of Contents
The C Programming Language, second edition by Kernighan and Ritchie.
Other people working through the book may be interested in the Notes to Accompany The C Programming Language by Steve Summit, maintainer of the comp.lang.c FAQ. Also be aware of the errata.
Answers
- There is a book, The C Answer Book by Tondo and Gimpel, that has answers to these exercises.
- The answer by bamsoftware.
- Another site (based on an older one) that has answers to most of the exercises.
- My answers: tcpl-answers-shi.tar.gz
Chapter 1
- page 5
The only way to learn a new programming language is by writing programs in it.
- page 11
you must match up the arguments to printf with the conversion specification; the compiler can't (or won't) generally check them for you or fix things up if you get them wrong. If fahr is a float, the code
printf("%d\n", fahr);
will not work. You might ask, ``Can't the compiler see that %d needs an integer and fahr is floating-point and do the conversion automatically, just like in the assignments and comparisons on page 12?'' And the answer is, no. As far as the compiler knows, you've just passed a character string and some other arguments to printf; it doesn't know that there's a connection between the arguments and some special characters inside the string. This is one of the implications of the fact, stated earlier, that functions like printf are not special.
- page 15
Notice that there is no semicolon at the end of a #define line.
Actually, all lines that begin with # are special; we'll learn more about them later.
- page 16
int c; c = getchar();
We must declare
c
to be a type big enough to hold any value thatgetchar
returns. We can't usechar
sincec
must be big enough to holdEOF
in addition to any possiblechar
. Therefore we useint
. - page 17
The line
while ((c = getchar()) !
EOF)= epitomizes the cryptic brevity which C is notorious for.We have four things to do:
- call
getchar
, - assign its return value to a variable,
- test the return value against
EOF
, and - process the character (in this case, print it again).
how do you send EOF? The answer depends on what kind of computer you're using. On Unix and Unix-related systems, it's almost always control-D. On MS-DOS machines, it's control-Z followed by the RETURN key. Under Think C on the Macintosh, it's control-D, just like Unix. On other systems, you may have to do some research to learn how to send EOF.
- call
Chapter 2
- page 39
Be careful to distinguish between a character constant and a string that contains a single character: 'x' is not the same as "x". The former is an integer used to produce the numeric value of the letter x in the machine;s character set. The latter is an array of characters that contains one character( the letter x) and a '\0'.
Name in different enumerations must be distinct. Values need not be distinct in the same enumeration.
- page 40
External and static variables are initialized to zero by default.
- page 41
The % operator cannot be applied to
float
ordouble
. The direction of truncation for / and the sign of the result for % are machine-dependent for negative operands, as is the action taken on overflow or underflow. This means that -7 / 4 might be -1 or -2, and -7 % 4 might be -3 or +1. - page 44
the lower type is promoted to the higher type, where the order of the types is
char < short int < int < long int < float < double < long double
- page 45
Casts can be a bit confusing at first. A cast is the syntax used to request an explicit type conversion; coercion is just a more formal word for ``conversion.''
- page 46
The authors point out that an expression like (i+j)++ is illegal, and it's worth thinking for a moment about why. The ++ operator doesn't just mean ``add one''; it means ``add one to a variable'' or ``make a variable's value one more than it was before.'' But (i+j) is not a variable, it's an expression; so there's no place for ++ to store the incremented result.
- page 52 Precedence and Order of Evaluation
To look at one more example, it might seem that the code
int i = 7; printf("%d\n", i++ * i++);
would have to print 56, because no matter which order the increments happen in, 7x8 is 8x7 is 56. But ++ just says that the increment happens later, not that it happens immediately, so this code could print 49 (if it chose to perform the multiplication first, and both increments later). And, it turns out that ambiguous expressions like this are such a bad idea that the ANSI C Standard does not require compilers to do anything reasonable with them at all, such that the above code might end up printing 42, or 8923409342, or 0, or crashing your computer.
Chapter 4
- page 77
push(pop() - pop()); /* WRONG */
Because + and * are commutative operations, the order in which the popped operands are combined is irrelevant, but for - and / the left and right operands must be distinguished. In the above, the order in which the two calls of
pop
are evaluated is not defined. To guarantee the right order, it is necessary to pop the first value into a temporary variable as we did in =
Chapter 5
- page 96
the swap example on page 96 does work.
void swap1(int *q1,int *q2) { int temp; temp = *q1; *q1 = *q2; *q2 = temp; }
But this swap does not work.
void swap2(int *q1, int *q2) { int *temp; temp = q1; q1 = q2; q2 = temp; }
- page 99
This explains, among other things, how getline and getop were able to modify the arrays in the caller, even though we said that call-by-value meant that functions can't modify variables in their callers since they receive copies of the parameters. When a function receives a pointer, it cannot modify the original pointer in the caller, but it can definitely modify what the pointer points to.
- page 104
As an example of what we can and can't do, given the lines
char amessage[] = "now is the time"; char *pmessage = "now is the time";
we could say
amessage[0] = 'N';
to make amessage say "Now is the time". But if we tried to do
pmessage[0] = 'N';
(which, as you may recall, is equivalent to =*pmessage = 'N'=), it would not necessarily work; we're not allowed to modify that string. (One reason is that the compiler might have placed the ``little block of characters'' in read-only memory. Another reason is that if we had written
char *pmessage = "now is the time"; char *qmessage = "now is the time";
the compiler might have used the same little block of memory to initialize both pointers, and we wouldn't want a change to one to alter the other.)
- page 112
More generally, only the first dimension (subscript) of an array is free; all the others have to be specified.
This just says what we said already: when declaring an array as a function parameter, you can leave off the first dimension because it is the overall length and not knowing it causes no immediate problems (unless you accidentally go off the end). But the compiler always needs to know the other dimensions, so that it knows how the rown and columns line up.
If a two-dimensional array is to be passed to a function, the parameter declaration in the function must include the number of columns; the number of rows i irrelevant, since what is passed is, as before, a pointer to an array of rows, where each row is an array of 13 ints.
- page 120
The generic pointer type void * is used for the pointer arguments. Any pointer can be cast to void * and back again without loss of information, so we can call qsort by casting arguments to void *.
Chapter 6
- page 135
sizeof
returns the size counted in bytes, where the C definition of ``byte'' is ``the size of a char.'' In other words,sizeof(char)
is always 1. (It turns out that it's not necessarily the case, though, that a byte or a char is 8 bits.)Strictly,
sizeof
produces an unsigned integer value whose type,size_t
, is defined in the header<stddef.h>
.A
sizeof
can not be used in a#if
line, because the preprocessor does not parse type names. But the expression in the#define
is not evaluated by the preprocessor. - page 138
Don't assume, however, that the size of a structure is the sum of the sizes of its members.
If this isn't the sort of thing you'd be likely to assume, you don't have to remember the reason, which is mildly esoteric (having to do with memory alignment requirements).
- page 142
We can test for this in two different ways, either before or after we make the ``last'' recursive call:
void treeprint(struct tnode *p) { if(p->left != NULL) treeprint(p->left); printf("%4d %s\n", p->count, p->word); if(p->right != NULL) treeprint(p->right); }
or
void treeprint(struct tnode *p) { if(p == NULL) return; treeprint(p->left); printf("%4d %s\n", p->count, p->word); treeprint(p->right); }
Sometimes, there's little difference between one approach and the other. Here, though, the second approach (which is equivalent to the code on page 142) has a distinct advantage: it will work even if the very first call is on an empty tree (in this case, if there were no words in the input). As we mentioned earlier, it's extremely nice if programs work well at their boundary conditions, even if we don't think those conditions are likely to occur.
- page 143
char str[] = "a"; sizeof(str); /*2*/ strlen(str); /*1*/
- page 146
In effect,
typedef
is like#define
, except that since it is interpreted by the computer, it can cope with textual substitutions that are beyond the capabilities of the preprocessor. For example,typedef int (*PFI)(char *, char *);
creates the type PFI, for "pointer to function (of two char * arguments) returning int," which can be used in contexts like
PFI strcmp, numcmp;
in the sort program.
Chapter 7
- page 152
Note that the stdio library generally does newline translation for you. If you know that lines are terminated by a linefeed on Unix and a carriage return on the Macintosh and a carriage-return/linefeed combination on MS-DOS, you don't have to worry about these things in C, because the line termination will always appear to a C program to be a single '\n'.
- page 154
A width or precision may be specified as *, in which case the value is computed by converting the next argument (which must be an int). For example, to print at most max characters from a string s,
prinf("%.*s", max, s);
- page 164
Confusingly,
gets
deletes the terminals '\n', andputs
adds it. - page 166
strstr(s,t) return pointer to first t in s, or NULL if not present
- page 168(Random Number Gneration)
the code for returning a floating-point random number in the interval [0,1) should be
#define frand() ((double) rand() / (RAND_MAX+1.0))
If you want to get random integers from M to N, you can use something likeM + (int)(frand() * (N-M+1))
``[Setting] the seed for rand'' refers to the fact that, by default, the sequence of pseudo-random numbers returned by rand is the same each time your program runs. To randomize it, you can call srand at the beginning of the program, handing it some truly random number, such as a value having to do with the time of day. (One way is with code like
#include <stdlib.h> #include <time.h> srand((unsigned int)time((time_t *)NULL));
which uses the time function mentioned on page 256 in appendix B10.)
One other caveat about rand: don't try to generate random 0/1 values (to simulate a coin flip, perhaps) with code like
rand() % 2
This looks like it ought to work, but it turns out that on some systems rand isn't always perfectly random, and returns values which consistently alternate even, odd, even, odd, etc. (In fact, for similar reasons, you shouldn't usually use rand() % N for any value of N.) A good way to get random 0/1 values would be(int)(frand() * 2)
based on the other frand() examples above.
Chapter 8
Appendix A: Reference Manual
- A2.1 Tokens
There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators.
- A.2.5.2 Character Constants
Name Abstract character Name Abstract Character newline NT(LF) \n backslash \ \\ horizontal tab HT \t question mark ? \? vertical VT \v single quote ' \' backspace BS \b double quote " \" carriage return CR \r octal number ooo \ooo audible BEL \a - A.8.6.1 Pointer Declarators
In a declaration T D where D has the form
*
type-qualifier-listopt D1 and the type of the identifier in the declaration T D1 is ``type-modifier T,'' the type of the identifier of D is ``type-modifier type-qualifier-list pointer to T.'' Qualifiers following * apply to pointer itself, rather than to the object to which the pointer points.For example, consider the declaration int *ap[];
Here, ap[] plays the role of D1; a declaration ``int ap[]'' (below) would give ap the type ``array of int,'' the type-qualifier list is empty, and the type-modifier is ``array of.'' Hence the actual declaration gives ap the type ``array to pointers to int.''
- A.8.8 Type names
int int * int *[3] int (*)[] int *() int (*[])(void)
name respectively the types ``integer,'' ``pointer to integer,'' ``array of 3 pointers to integers,'' ``pointer to an unspecified number of integers,'' ``function of unspecified parameters returning pointer to integer,'' and ``array, of unspecified size, of pointers to functions with no parameters each returning an integer.''
- A.12.3 Macro Definition and Expansion
First, if an occurrence of a parameter in the replacement token sequence is immediately preceded by #, string quotes (") are placed around the corresponding parameter, and then both the # and the parameter identifier are replaced by the quoted argument. A \ character is inserted before each " or \ character that appears surrounding, or inside, a string literal or character constant in the argument.
Second, if the definition token sequence for either kind of macro contains a ## operator, then just after replacement of the parameters, each ## is deleted, together with any white space on either side, so as to concatenate the adjacent tokens and form a new token. The effect is undefined if invalid tokens are produced, or if the result depends on the order of processing of the ## operators. Also, ## may not appear at the beginning or end of a replacement token sequence.
Given the definition
#define tempfile(dir) #dir "%s"
the macro call
tempfile(/usr/tmp)
yields"/usr/tmp" "%s"
which will subsequently be catenated into a single string. After
#define cat(x, y) x ## y
the call
cat(var, 123)
yields var123. However, the callcat(cat(1,2),3)
is undefined: the presence of ## prevents the arguments of the outer call from being expanded. Thus it produces the token stringcat ( 1 , 2 )3
and )3 (the catenation of the last token of the first argument with the first token of the second) is not a legal token. If a second level of macro definition is introduced,
#define xcat(x, y) cat(x,y)
things work more smoothly;
xcat(xcat(1, 2), 3)
does produce 123, because the expansion of xcat itself does not involve the ## operator. - A.12.7 Error Generation
A preprocessor line of the form
causes the preprocessor to write a diagnostic message that includes the token sequence.
- A.12.8 Pragmas
A control line of the form
causes the preprocessor to perform an implementation-dependent action. An unrecognized pragma is ignored.
- A.12.10 Predefined names
Several identifiers are predefined, and expand to produce special information. They, and also the preprocessor expansion operator defined, may not be undefined or redefined.
- LINE A decimal constant containing the current source line number.
- FILE A string literal containing the name of the file being compiled.
- DATE A string literal containing the date of compilation, in the form "Mmmm dd yyyy"
- TIME A string literal containing the time of compilation, in the form "hh:mm:ss"
- STDC The constant 1. It is intended that this identifier be defined to be 1 only in standard-conforming implementations.
Appendix B - Standard Library
<assert.h> <ctype.h> <errno.h> <float.h> <limits.h>
<locale.h> <math.h> <setjmp.h> <signal.h> <stdarg.h>
<stddef.h> <stdio.h> <stdlib.h> <string.h> <time.h>
- B.2 Character Class Tests: <ctype.h>
isalnum(c) isalpha(c) or isdigit(c) is true isalpha(c) isupper(c) or islower(c) is true iscntrl(c) control character isdigit(c) decimal digit isgraph(c) printing character except space islower(c) lower-case letter isprint(c) printing character including space ispunct(c) printing character except space or letter or digit isspace(c) space, formfeed, newline, carriage return, tab, vertical tab isupper(c) upper-case letter isxdigit(c) hexadecimal digit int tolower(c) convert c to lower case int toupper(c) convert c to upper case
- B.5 Utility Functions: <stdlib.h>
+ double atof(const char *s) atof converts s to double; it is equivalent to strtod(s, (char**)NULL). + int atoi(const char *s) converts s to int; it is equivalent to (int)strtol(s, (char**)NULL, 10). + long atol(const char *s) converts s to long; it is equivalent to strtol(s, (char**)NULL, 10). + double strtod(const char *s, char **endp) strtod converts the prefix of s to double, ignoring leading white space; it stores a pointer to any unconverted suffix in *endp unless endp is NULL. If the answer would overflow, HUGE_VAL is returned with the proper sign; if the answer would underflow, zero is returned. In either case errno is set to ERANGE. + long strtol(const char *s, char **endp, int base) strtol converts the prefix of s to long, ignoring leading white space; it stores a pointer to any unconverted suffix in *endp unless endp is NULL. If base is between 2 and 36, conversion is done assuming that the input is written in that base. If base is zero, the base is 8, 10, or 16; leading 0 implies octal and leading 0x or 0X hexadecimal. Letters in either case represent digits from 10 to base-1; a leading 0x or 0X is permitted in base 16. If the answer would overflow, LONG_MAX or LONG_MIN is returned, depending on the sign of the result, and errno is set to ERANGE. + unsigned long strtoul(const char *s, char **endp, int base) strtoul is the same as strtol except that the result is unsigned long and the error value is ULONG_MAX. + int rand(void) rand returns a pseudo-random integer in the range 0 to RAND_MAX, which is at least 32767. + void srand(unsigned int seed) srand uses seed as the seed for a new sequence of pseudo-random numbers. The initial seed is 1. + void *calloc(size_t nobj, size_t size) calloc returns a pointer to space for an array of nobj objects, each of size size, or NULL if the request cannot be satisfied. The space is initialized to zero bytes. + void *malloc(size_t size) malloc returns a pointer to space for an object of size size, or NULL if the request cannot be satisfied. The space is uninitialized. + void *realloc(void *p, size_t size) realloc changes the size of the object pointed to by p to size. The contents will be unchanged up to the minimum of the old and new sizes. If the new size is larger, the new space is uninitialized. realloc returns a pointer to the new space, or NULL if the request cannot be satisfied, in which case *p is unchanged. + void free(void *p) free deallocates the space pointed to by p; it does nothing if p is NULL. p must be a pointer to space previously allocated by calloc, malloc, or realloc. + void abort(void) abort causes the program to terminate abnormally, as if by raise(SIGABRT). + void exit(int status) exit causes normal program termination. atexit functions are called in reverse order of registration, open files are flushed, open streams are closed, and control is returned to the environment. How status is returned to the environment is implementation- dependent, but zero is taken as successful termination. The values EXIT_SUCCESS and EXIT_FAILURE may also be used. + int atexit(void (*fcn)(void)) atexit registers the function fcn to be called when the program terminates normally; it returns non-zero if the registration cannot be made. + int system(const char *s) system passes the string s to the environment for execution. If s is NULL, system returns non-zero if there is a command processor. If s is not NULL, the return value is implementation-dependent. + char *getenv(const char *name) getenv returns the environment string associated with name, or NULL if no string exists. Details are implementation-dependent. + void *bsearch(const void *key, const void *base, size_t n, size_t size, int (*cmp)(const void *keyval, const void *datum)) bsearch searches base[0]...base[n-1] for an item that matches *key. The function cmp must return negative if its first argument (the search key) is less than its second (a table entry), zero if equal, and positive if greater. Items in the array base must be in ascending order. bsearch returns a pointer to a matching item, or NULL if none exists. + void qsort(void *base, size_t n, size_t size, int (*cmp)(const void *, const void *)) qsort sorts into ascending order an array base[0]...base[n-1] of objects of size size. The comparison function cmp is as in bsearch. + int abs(int n) abs returns the absolute value of its int argument. + long labs(long n) labs returns the absolute value of its long argument. + div_t div(int num, int denom) div computes the quotient and remainder of num/denom. The results are stored in the int members quot and rem of a structure of type div_t. + ldiv_t ldiv(long num, long denom) ldiv computes the quotient and remainder of num/denom. The results are stored in the long members quot and rem of a structure of type ldiv_t
- B.7 Variable Argument Lists: <stdarg.h>
The header <stdarg.h> provides facilities for stepping through a list of function arguments of unknown number and type.
va_list ap; va_start(va_list ap, lastarg); type va_arg(va_list ap, type); void va_end(va_list ap);
- B.8 Non-local Jumps: <setjmp.h>
The declarations in <setjmp.h> provide a way to avoid the normal function call and return sequence, typically to permit an immediate return from a deeply nested function call.
- int setjmp(jmpbuf env)
The macro setjmp saves state information in env for use by longjmp. The return is zero from a direct call of setjmp, and non-zero from a subsequent call of longjmp. A call to setjmp can only occur in certain contexts, basically the test of if, switch, and loops, and only in simple relational expressions.
if (setjmp(env) == 0) /* get here on direct call */ else /* get here by calling longjmp */
- void longjmp(jmpbuf env, int val)
longjmp restores the state saved by the most recent call to setjmp, using the information saved in env, and execution resumes as if the setjmp function had just executed and returned the non-zero value val. The function containing the setjmp must not have terminated. Accessible objects have the values they had at the time longjmp was called, except that non-volatile automatic variables in the function calling setjmp become undefined if they were changed after the setjmp call.
- B.9 Signals: <signal.h>
The header <signal.h> provides facilities for handling exceptional conditions that arise during execution, such as an interrupt signal from an external source or an error in execution.
void (*signal(int sig, void (*handler)(int)))(int)
signal determines how subsequent signals will be handled. If handler is SIGDFL, the implementation-defined default behavior is used, if it is SIGIGN, the signal is ignored; otherwise, the function pointed to by handler will be called, with the argument of the type of signal. Valid signals include
SIGABRT abnormal termination, e.g., from abort
SIGFPE arithmetic error, e.g., zero divide or overflow
SIGILL illegal function image, e.g., illegal instruction
SIGINT interactive attention, e.g., interrupt
SIGSEGV illegal storage access, e.g., access outside memory limits
SIGTERM termination request sent to this program
signal returns the previous value of handler for the specific signal, or SIGERR if an error occurs.
The initial state of signals is implementation-defined.
int raise(int sig)
raise sends the signal sig to the program; it returns non-zero if unsuccessful. - B.11 Implementation-defined Limits: <limits.h> and <float.h>
The header <limits.h> defines constants for the sizes of integral types. The values below are acceptable minimum magnitudes; larger values may be used.
CHAR_BIT 8 bits in a char CHAR_MAX UCHAR_MAX or SCHAR_MAX maximum value of char CHAR_MIN 0 or SCHAR_MIN maximum value of char INT_MAX 32767 maximum value of int INT_MIN -32767 minimum value of int LONG_MAX 2147483647 maximum value of long LONG_MIN -2147483647 minimum value of long SCHAR_MAX +127 maximum value of signed char SCHAR_MIN -127 minimum value of signed char SHRT_MAX +32767 maximum value of short SHRT_MIN -32767 minimum value of short UCHAR_MAX 255 maximum value of unsigned char UINT_MAX 65535 maximum value of unsigned int ULONG_MAX 4294967295 maximum value of unsigned long USHRT_MAX 65535 maximum value of unsigned short
The names in the table below, a subset of <float.h>, are constants related to floating-point arithmetic. When a value is given, it represents the minimum magnitude for the corresponding quantity. Each implementation defines appropriate values.
FLT_RADIX 2 radix of exponent, representation, e.g., 2, 16 FLT_ROUNDS floating-point rounding mode for addition FLT_DIG 6 decimal digits of precision FLT_EPSILON 1E-5 smallest number x such that 1.0+x != 1.0 FLT_MANT_DIG number of base FLT_RADIX in mantissa FLT_MAX 1E+37 maximum floating-point number FLT_MAX_EXP maximum n such that FLT_RADIX(n-1) is representable FLT_MIN 1E-37 minimum normalized floating-point number FLT_MIN_EXP minimum n such that 10(n) is a normalized number DBL_DIG 10 decimal digits of precision DBL_EPSILON 1E-9 smallest number x such that 1.0+x != 1.0 DBL_MANT_DIG number of base FLT_RADIX in mantissa DBL_MAX 1E+37 maximum double floating-point number DBL_MAX_EXP maximum n such that FLT_RADIX(n-1) is representable DBL_MIN 1E-37 minimum normalized double floating-point number DBL_MIN_EXP minimum normalized double floating-point number